PYTHON INTRODUCTION¶

Overview:

  • Python Language Basics;
  • Built-in Data Structures;
  • Control Flow statements;
  • Functions; and Files.
In [ ]:
print("Hello world")

1. Python Language Basics¶

1.1. Why Python is the best language for data analysis¶

Python is an interpreted, high-level programming language created by the Dutch programmer Guido van Rossum and first released in 1991. It lets you work quickly and integrate systems effectively. Five reasons make it a good fit for data analysis:

  1. Easy to learn with simpler syntax
  2. Large number of open source libraries to assist us in every stage of data analysis
  3. Wider community support
  4. One for all
  5. Large availability of learning resources and materials

1. Easy to learn with simpler syntax:

Python has one of the simplest syntaxes among programming languages. It reads more like plain English than the more complex syntax of Java or C++, which makes it beginner friendly: anyone starting out can learn it in a relatively short period of time.

2. Large number of open source libraries to assist us in every stage of data analysis

Python has a large number of open source libraries (numpy, pandas, matplotlib, seaborn, ...) that provide predefined functions for many of the operations required in data analysis, which saves the time and effort of writing that code ourselves each time.

3. Wider community support:

Python is one of the most widely used languages in many fields, which has produced a very large, widespread and diverse community. It is therefore easier to get community support when you face an issue with a particular tool or encounter a problem while working on your project. The Python documentation is also open and easy to understand.

4. One for all

This is a key point that distinguishes Python from its competitors. A data analysis project is not only about data analysis. In the real world, data is not always well organized and presented in Excel sheets and CSV files. Most of the time, data needs to be extracted from various sources, and its insights and results need to be presented interactively across different platforms, such as a website. The project therefore requires integrating web scraping, data wrangling, cleaning and similar processes. This integration is easier in Python because it has rich libraries for these tasks as well as for data analysis, which many other languages lack.

Built-in and external libraries are powerful for many tasks, especially in data science, data engineering and data analysis:

  • Scientific computing: numpy, numba, etc.
  • AWS interaction: boto3
  • Image processing: opencv, pillow, albumentations, etc.
  • Machine learning and Deep learning framework: tensorflow, pytorch, onnx, sklearn, etc.
  • Data processing and visualization: pandas, pyyaml, matplotlib, seaborn etc.

5. Large availability of learning resources and materials

Python has a large number of online learning materials and resources, many of which are beginner friendly, so learners have many options to choose from in their learning journey.

[Figure: image-2.png, from the Stack Overflow survey 2021: % of developers who are not developing with the language or technology but have expressed interest in developing with it]

Python is a language that grows with you: it takes little effort to get started, but there is no limit to where you can go!

1.2 Language Semantics¶

  • Python semantics
    • Indentation, not braces
    • Everything is an object
    • Comments
    • Variable
    • Dynamic references, strong types
    • Attributes and methods
    • Duck typing
    • Import
    • Mutable and immutable objects

Indentation, not braces¶

Python uses whitespace (tabs or spaces) to structure code instead of braces, as in many other languages like R, C++, Java, and Perl. A colon denotes the start of an indented code block, after which all of the code must be indented by the same amount until the end of the block.

Use four spaces as your default indentation, and replace tabs with four spaces. (Jupyter notebook: Ctrl + ])

In [1]:
for i in [1,2,3]:
    print(i)
    print("Block 1")
    
    
print("--End block 1--")
1
Block 1
2
Block 1
3
Block 1
--End block 1--
QUIZ: What is the output of the following code?

```python
print("This is line 1")
    print("This is line 2")
```

* [ ] **A**
```
This is line 1
This is line 2
```
* [ ] **B**
```python
IndentationError: unexpected indent
```

Answer B

Everything is an object¶

Every number, string, data structure, function, class, module, and so on exists in the Python interpreter in its own “box,” which is referred to as a Python object. Each object has an associated type (e.g., string or function) and internal data.

In [2]:
type("Hello world")
Out[2]:
str
In [3]:
type(1)
Out[3]:
int

Comments¶

Any text preceded by the hash mark (pound sign) # is ignored by the Python interpreter.

In [4]:
# This statement prints the text Hello world
# print("Hello world")

print("Hello world")   # print("Hello world")
Hello world
QUIZ: How many times is the string "Hello world" printed?

```python
#print("Hello world")
print("Hello world")
print("Hello world")
```

Answer 2

Variable¶

One of the most powerful features of a programming language is the ability to manipulate variables. A variable is a name that refers to a value. The assignment statement gives a value to a variable.

The assignment token, =, should not be confused with equals, which uses the token ==.

In [6]:
message = "What’s up, Doc?"
n = 17
pi = 3.14159
In [7]:
sum_nums = 5 + 3
In [8]:
print(sum_nums)
8
In [9]:
sum_nums = 10
In [10]:
print(sum_nums)
10

Variable names can be arbitrarily long. Some rules for naming Python variables:

  • A variable name can contain letters (a-z, A-Z), digits (0-9) and the underscore _; it cannot contain spaces or special characters such as ! @ # $ % ^ & * ( ) - = + { } : " ; , . / ?
  • A variable name cannot start with a digit; it can only start with a letter or an underscore. Example: 1abc is a bad name; abc1 and _1abc are good names.
  • Variable names are case-sensitive. Example: abc, Abc and aBC are different variables.
  • A variable name cannot be a keyword.

List of keywords

and as assert async await break class continue def del elif else
except False finally for from global if import in is lambda None
nonlocal not or pass raise return True try while with yield
In [11]:
tcb = 5000000
In [12]:
type(tcb)
Out[12]:
int
In [13]:
tcb = "hello world"
In [14]:
type(tcb)
Out[14]:
str
In [7]:
xez_123 = 1000
xez_123
Out[7]:
1000
QUIZ: Which are invalid names in Python?

- [ ] **A** MyVariable
- [ ] **B** M4Varialbe
- [ ] **C** 4Varibale
- [ ] **D** My$Var
- [ ] **E** sum
- [ ] **F** my varible

Answer C, D, F. Option E (sum) is a valid name, but it shadows the built-in sum function and is best avoided.

Dynamic references, strong types¶

In contrast with many compiled languages, such as Java and C++, object references in Python have no type associated with them.
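
A minimal sketch of what the heading means, using illustrative variable names that are not from the original notebook: a name can be rebound to an object of another type (dynamic references), but Python does not silently convert between unrelated types (strong types).

```python
# Dynamic references: the same name can point to objects of different types.
value = 5
first_type = type(value)       # int
value = "five"
second_type = type(value)      # str

# Strong types: there is no implicit conversion between str and int.
try:
    result = "5" + 5
except TypeError as err:
    result = None
    error_message = str(err)

print(first_type, second_type)
print(error_message)
```

An explicit conversion such as int("5") + 5 or "5" + str(5) is needed to make the last operation work.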

Attributes and methods¶

Objects in Python typically have both attributes (other Python objects stored “inside” the object) and methods (functions associated with an object that can have access to the object’s internal data). Both are accessed with the dot syntax:

obj.attribute_name

In [15]:
c = "Hello World"
In [9]:
type(c)
Out[9]:
str
In [16]:
c.upper()
Out[16]:
'HELLO WORLD'
In [17]:
dir(c)
Out[17]:
['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'removeprefix',
 'removesuffix',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']
In [18]:
help(c.upper)
Help on built-in function upper:

upper() method of builtins.str instance
    Return a copy of the string converted to uppercase.

dir(object)
    Returns all the attributes and methods of an object, including the built-in ones that every object has by default.
help(object.method)
    Returns a help page.

In Jupyter Notebook, Shift + Tab shows the same help.


In [ ]:
c.upper

Duck typing

Often you may not care about the type of an object but rather only whether it has certain methods or behavior. This is sometimes called “duck typing,” after the saying “If it walks like a duck and quacks like a duck, then it’s a duck.”
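
As an illustration of duck typing, the sketch below checks whether an object can be iterated over without looking at its concrete type. The helper name is_iterable is hypothetical and not part of the original notebook; it relies only on the built-in iter function.

```python
def is_iterable(obj):
    """Return True if obj supports iteration, whatever its concrete type."""
    try:
        iter(obj)
        return True
    except TypeError:
        # iter() raises TypeError for objects that cannot be iterated
        return False

print(is_iterable("a string"))   # True
print(is_iterable([1, 2, 3]))    # True
print(is_iterable(5))            # False
```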

Import

There are three different ways to import names into the current namespace and use them:

In [22]:
# Single identifier random is added to the current namespace. 
import random
random.randint(1,6)
Out[22]:
2
In [23]:
random.randint(1,100)
Out[23]:
77
In [25]:
from random import randint
randint(1,6)
Out[25]:
5
In [26]:
# We can make things shorter by importing a module under a different name
import random as rd
rd.randint(1,6)
Out[26]:
1
In [ ]:
# NOT RECOMMENDED: a wildcard import adds every public name of the module to the current namespace
# from random import *
# randint(1,6)

Mutable and immutable objects¶

  • Mutable objects are those whose contents can be modified after creation (for example, dict and list)
  • Others, like strings and tuples, are immutable: their contents cannot be changed after creation
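
A short sketch of the difference, with illustrative names (numbers, point, alias) that are assumptions for this example only. It also shows that two names can refer to the same mutable object, so a change through one name is visible through the other.

```python
numbers = [1, 2, 3]        # list: mutable
numbers[0] = 100           # modified in place

point = (1, 2, 3)          # tuple: immutable
try:
    point[0] = 100
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Two names can refer to the same mutable object.
alias = numbers
alias.append(4)

print(numbers)        # [100, 2, 3, 4]
print(tuple_changed)  # False
```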

1.2. Scalar Types¶

  • Numeric types
  • Strings
  • Bytes and Unicode
  • Booleans
  • None

Standard Python scalar types

| Type | Description | Example |
|---|---|---|
| int | Arbitrary precision signed integer | 3, 5, 100 |
| float | Double-precision (64-bit) floating-point number (note there is no separate double type) | 3.0, 1.2, 108.3 |
| str | String type; holds Unicode (UTF-8 encoded) strings | "Hello world", "3.0", "5" |
| None | The Python “null” value (only one instance of the None object exists) | None |
| bool | A True or False value | True, False |
| bytes | Raw ASCII bytes (or Unicode encoded as bytes) | b'espa\xc3\xb1ol' |

1.2.1. Numeric types (int, float)¶

The primary Python types for numbers are int and float.

In [1]:
ival = 871
In [2]:
fval = 871.0
In [3]:
type(ival)
Out[3]:
int
In [4]:
type(fval)
Out[4]:
float
In [42]:
# Dividing two integers with / always yields a floating-point number, even when the result is a whole number:
3/2
Out[42]:
1.5
In [43]:
1 + 3 
Out[43]:
4
In [44]:
4/2
Out[44]:
2.0
In [45]:
1 + 2
Out[45]:
3
In [46]:
5 *6 
Out[46]:
30

1.2.2. Strings¶

  • You can write string literals using either single quotes ' or double quotes "
  • For multiline strings with line breaks, you can use triple quotes, either ''' or """
  • Strings are immutable; you cannot modify a string
In [5]:
a = 'one way of writing a string'
b = "another way"
c = """
This is a longer string that
spans multiple lines
"""
In [7]:
' I said:"hello" '
Out[7]:
' I said:"hello" '
In [8]:
a = 'I said: "Hello"'

Indexing

In [9]:
a[2]
Out[9]:
's'
In [11]:
a[-5]
Out[11]:
'e'

Slicing

In [15]:
a[8:15]
Out[15]:
'"Hello"'
In [17]:
a[2:6]
Out[17]:
'said'
In [18]:
a[8:]
Out[18]:
'"Hello"'
In [ ]:
# The following cells use a different string
a = 'one way of writing a string'
In [56]:
a[-3]
Out[56]:
'i'
In [61]:
a[-6:]
Out[61]:
'string'
In [52]:
a[5]
Out[52]:
'a'
In [11]:
a
Out[11]:
'one way of writing a string'
In [ ]:
a = "Hello world"
In [ ]:
a.upper()
In [ ]:
a = a.upper()
In [ ]:
a

We can access every character in a string by its index, that is, its position from 0 to the length - 1, with the syntax string[index].

In [ ]:
a
In [ ]:
a[4]
In [ ]:
a[-16:-9]

We can use a negative index to access characters counting from the end of a string.

In [ ]:
a[-27]
In [ ]:
len(a)-1
In [ ]:
-1
In [ ]:
a[27]

The syntax string[:3] is called slicing and is implemented for many kinds of Python sequences.

In [ ]:
a = 'one way of writing a string'
In [ ]:
a[4:7]
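
The indexing and slicing rules above can be summarized in one self-contained sketch. The variable names (text, first, word and so on) are chosen for this example only; the string is the one used in the cells above.

```python
text = 'one way of writing a string'   # 27 characters

first = text[0]                  # 'o'
last = text[-1]                  # 'g'
same_last = text[len(text) - 1]  # same character as text[-1]
word = text[4:7]                 # 'way' (start included, stop excluded)
head = text[:3]                  # 'one'
tail = text[-6:]                 # 'string'

try:
    text[27]                     # valid indexes are 0 to 26
    out_of_range = False
except IndexError:
    out_of_range = True

print(first, last, word, head, tail, out_of_range)
```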

Adding two strings together with + concatenates them and produces a new string.

In [ ]:
a + " "+ b

String methods¶

The examples use a = ' happy birthDay '.

| Method | Description | Example | Result |
|---|---|---|---|
| .capitalize() | Converts the first character to upper case and the rest to lower case | >>> a.strip().capitalize() | 'Happy birthday' |
| .upper() | Converts a string into upper case | >>> a.upper() | ' HAPPY BIRTHDAY ' |
| .lower() | Converts a string into lower case | >>> a.lower() | ' happy birthday ' |
| .strip() | Returns a trimmed version of the string | >>> a.strip() | 'happy birthDay' |
| .rstrip() | Returns a right trim version of the string | >>> a.rstrip() | ' happy birthDay' |
| .lstrip() | Returns a left trim version of the string | >>> a.lstrip() | 'happy birthDay ' |
| .split() | Splits the string at the specified separator, and returns a list | >>> a.split() | ['happy', 'birthDay'] |
| .title() | Converts the first character of each word to upper case and the rest to lower case | >>> a.title() | ' Happy Birthday ' |

More string methods are listed in the official Python documentation.

In [24]:
string = "   one way of writing a string   "
In [22]:
string.split()
Out[22]:
['one', 'way', 'of', 'writing', 'a', 'string']
In [23]:
string.title()
Out[23]:
'   One Way Of Writing A String   '
In [25]:
string.strip()
Out[25]:
'one way of writing a string'
In [26]:
string.upper()
Out[26]:
'   ONE WAY OF WRITING A STRING   '
In [27]:
dir(string)
Out[27]:
['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'removeprefix',
 'removesuffix',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']
In [28]:
help(a.split)
Help on built-in function split:

split(sep=None, maxsplit=-1) method of builtins.str instance
    Return a list of the words in the string, using sep as the delimiter string.
    
    sep
      The delimiter according which to split the string.
      None (the default value) means split according to any whitespace,
      and discard empty strings from the result.
    maxsplit
      Maximum number of splits to do.
      -1 (the default value) means no limit.

In [ ]:
a = 'one way of writing a string'
In [ ]:
a.split()
In [ ]:
a.split('f')

Note

string.split(separator, maxsplit)
    separator   Optional. Specifies the separator to use when splitting the string. By default any whitespace is a separator
    maxsplit    Optional. Specifies how many splits to do. Default value is -1, which is "all occurrences"
In [ ]:
# Capitalizes the first letter of each word
a.title()
In [ ]:
d = "    hello     world   "
In [ ]:
d.strip()
In [ ]:
a.index("w")
In [65]:
a + b
Out[65]:
'one way of writing a stringanother way'
In [66]:
print(a+ " " +b)
one way of writing a string another way

String format¶

Concatenating a string with a value of another type raises a TypeError. To combine values of other types with a string, use the .format() method:

In [29]:
level = 1

a = 'MCI Python level '

a + level
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [29], in <cell line: 5>()
      1 level = 1
      3 a = 'MCI Python level '
----> 5 a + level

TypeError: can only concatenate str (not "int") to str
In [30]:
level = 1

a = 'MCI Python level {}'.format(level)
print(a)
MCI Python level 1
In [69]:
level = 1
course = 'Python'

a = 'MCI {} level {}'.format(course, level)
print(a)
MCI Python level 1

Use format() method with the index:

In [45]:
num_1 = 15
num_2 = 4
In [ ]:
a = 'Sum of {} and {} is equal to {}'.format(num_1, num_2, num_1 + num_2)
print(a)
In [ ]:
a = 'Sum of {0} and {1} is equal to {2}'.format(num_1, num_2, num_1 + num_2)
print(a)
In [ ]:
a = 'Sum of {1} and {2} is equal to {0}'.format(num_1 + num_2, num_1, num_2)
print(a)
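
The three cells above all produce the same sentence. The sketch below makes that explicit and adds one more case, an index that is reused; the result names (implicit, explicit, reordered, repeated) are illustrative only.

```python
num_1 = 15
num_2 = 4

# Empty braces are filled in order; numbered braces pick arguments by position.
implicit = 'Sum of {} and {} is equal to {}'.format(num_1, num_2, num_1 + num_2)
explicit = 'Sum of {0} and {1} is equal to {2}'.format(num_1, num_2, num_1 + num_2)
reordered = 'Sum of {1} and {2} is equal to {0}'.format(num_1 + num_2, num_1, num_2)

# An index can be used more than once.
repeated = '{0} + {0} = {1}'.format(num_2, num_2 + num_2)

print(implicit)   # Sum of 15 and 4 is equal to 19
print(repeated)   # 4 + 4 = 8
```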

QUIZ:

Write code that combines the following variables to print the output below (use `.format`):

“Python is powerful... and fast; plays well with others; runs everywhere; is friendly & easy to learn; is Open.”

a = 'Python is powerful... and fast'
b = 'plays well with others'
c = 'runs everywhere'
d = 'is friendly & easy to learn'
e = 'is Open'
Answer
     
a = 'Python is powerful... and fast'    
b = 'plays well with others'    
c = 'runs everywhere'    
d = 'is friendly & easy to learn'    
e = 'is Open'    
print("{0}; {1}; {2}; {3}; {4}.".format(a,b,c,d,e))
    

1.2.3. Bytes and Unicode¶

  • In Python 3, Unicode is the first-class string type, which enables more consistent handling of ASCII and non-ASCII text.
  • You can convert a Unicode string to its bytes representation with the encode method, and go back with the decode method.
In [ ]:
a = "xin chào"
In [ ]:
val = "español"

val_utf8 = val.encode('utf-8')

val_utf8.decode('utf-8')
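
To make the round trip visible, the following sketch prints the intermediate bytes object. It reuses the val and val_utf8 names from the cell above; round_trip is an illustrative name.

```python
val = "español"

val_utf8 = val.encode("utf-8")     # str -> bytes
print(type(val_utf8))              # <class 'bytes'>
print(val_utf8)                    # b'espa\xc3\xb1ol'

# The non-ASCII character ñ takes two bytes in UTF-8.
print(len(val), len(val_utf8))     # 7 8

round_trip = val_utf8.decode("utf-8")   # bytes -> str
print(round_trip == val)           # True
```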

1.2.4. Booleans¶

The two boolean values in Python are written as True and False.

In [ ]:
True and True

1.2.5. None¶

None is the Python null value. If a function does not explicitly return a value, it implicitly returns None.

In [ ]:
a = None
a is None
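
A small sketch of the implicit return value, using a hypothetical function named greet that is not part of the original notebook:

```python
def greet(name):
    # There is no return statement, so calling greet evaluates to None.
    print("Hello", name)

result = greet("world")
print(result is None)   # True
```

Comparing with is None rather than == None is the usual idiom, because only one None object exists.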

1.2.6. Dates and times¶

Dates are not a built-in data type of their own in Python; the standard-library datetime module provides the datetime, date, and time types. The datetime type combines the information stored in date and time.

In [50]:
# Import the class directly so that the name dt does not shadow a module alias
from datetime import datetime
dt = datetime(2011, 10, 29, 20, 30, 21)
dt
Out[50]:
datetime.datetime(2011, 10, 29, 20, 30, 21)
In [51]:
type(dt)
Out[51]:
datetime.datetime
In [52]:
dt.day
Out[52]:
29
In [53]:
dt.minute
Out[53]:
30
QUIZ: What is the type of each item in the list?

```python
1. 15
2. 15.0
3. '15'
4. tcb
```

Answer 1: integer; 2: float; 3: string; 4: variable (its type depends on the value assigned to it)

1.3. Binary operators and comparisons¶

  • Arithmetic operators
  • Comparisons
  • Combining conditions with logical operators

Operators are special tokens that represent computations like addition, multiplication and division. The values the operator works on are called operands.

1.3.1 Arithmetic operators¶

Python supports the following arithmetic operators:

| Operator | Purpose | Example | Result |
|---|---|---|---|
| + | Addition | 2 + 3 | 5 |
| - | Subtraction | 3 - 2 | 1 |
| * | Multiplication | 8 * 12 | 96 |
| / | Division | 100 / 7 | 14.28.. |
| // | Floor Division | 100 // 7 | 14 |
| % | Modulus/Remainder | 100 % 7 | 2 |
| ** | Exponent | 5 ** 3 | 125 |
In [54]:
5/3
Out[54]:
1.6666666666666667
In [55]:
5//3
Out[55]:
1
In [56]:
5%3
Out[56]:
2
In [ ]:
2 * 3 ** 4 / 6
In [ ]:
8%5
The order of evaluation depends on the rules of precedence:
  1. Parentheses have the highest precedence.
  2. Exponentiation has the next highest precedence.
  3. Multiplication, both division operators, and modulus have the same precedence, which is higher than addition and subtraction.
  4. Operators with the same precedence are evaluated from left to right; in algebra we say they are left-associative. Exponentiation is the exception: it is evaluated from right to left.
In [70]:
20 + 4 * 10
Out[70]:
60
In [71]:
(20 + 4) * 10
Out[71]:
240
In [72]:
(2 *(3 ** 4)) / 6
Out[72]:
27.0
In [73]:
3 ** 4 
Out[73]:
81
In [74]:
2*81
Out[74]:
162
In [ ]:
162/6
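
The sketch below checks the grouping rules on three expressions. Operators of equal precedence group from left to right, while chained exponentiation groups from right to left. The variable names are illustrative only.

```python
left_to_right = 100 / 10 / 5     # evaluated as (100 / 10) / 5
right_to_left = 2 ** 3 ** 2      # evaluated as 2 ** (3 ** 2)
mixed = 2 * 3 ** 4 / 6           # evaluated as (2 * (3 ** 4)) / 6

print(left_to_right)   # 2.0
print(right_to_left)   # 512
print(mixed)           # 27.0
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing.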
QUIZ: What is the value of the following expression: 16 - 2 * 5 // 3 + 1
- [ ] A. 14
- [ ] B. 24
- [ ] C. 3
- [ ] D. 13.667
Answer A (14)

1.3.2 Comparisons¶

Python also provides several operations for comparing numbers & variables.

| Operator | Description |
|---|---|
| == | Check if operands are equal |
| != | Check if operands are not equal |
| > | Check if left operand is greater than right operand |
| < | Check if left operand is less than right operand |
| >= | Check if left operand is greater than or equal to right operand |
| <= | Check if left operand is less than or equal to right operand |

The result of a comparison operation is either True or False (note the uppercase T and F). These are special keywords in Python.

In [75]:
a = 5
b = 4
In [76]:
a > b
Out[76]:
True
In [77]:
a >= b
Out[77]:
True
In [78]:
a < b
Out[78]:
False

1.3.3 Combining conditions with logical operators¶

The logical operators and, or and not operate upon conditions and True & False values (also known as booleans). and and or operate on two conditions, whereas not operates on a single condition.

The and operator returns True when both the conditions evaluate to True. Otherwise, it returns False.

| a | b | a and b |
|---|---|---|
| True | True | True |
| True | False | False |
| False | True | False |
| False | False | False |
In [79]:
x = 3
In [80]:
x > 0 and x < 5
Out[80]:
True
In [81]:
x = 6
In [82]:
x > 0 and x < 5
Out[82]:
False

The or operator returns True if at least one of the conditions evaluates to True. It returns False only if both conditions are False.

| a | b | a or b |
|---|---|---|
| True | True | True |
| True | False | True |
| False | True | True |
| False | False | False |
In [ ]:
x = 4
In [ ]:
x > 0 or x < 5
In [ ]:
x = 6
In [ ]:
x > 0 or x < 5

The not operator returns False if a condition is True and True if the condition is False.

In [83]:
x = 3
not x <5
Out[83]:
False

Logical operators can be combined to form complex conditions. Use round brackets or parentheses ( and ) to indicate the order in which logical operators should be applied.

Common Mistake! A very common mistake occurs when programmers write boolean expressions. Suppose we have a variable number and want to check whether its value is 5 or 6. In words we might say “number equal to 5 or 6”, but the literal translation number == 5 or 6 does not give the correct result: Python reads it as (number == 5) or 6, and because 6 counts as true the expression is never False. The or operator needs a complete equality check on both sides, so the correct way to write this is number == 5 or number == 6.
In [7]:
number = 3
In [8]:
number == 5 or number == 6
Out[8]:
False
In [9]:
number == 5 or 6
Out[9]:
6
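
The following sketch puts the wrong and the correct forms side by side. The membership test with in is an alternative spelling that is not used in the original notebook; the variable names are illustrative.

```python
number = 3

wrong = number == 5 or 6              # parsed as (number == 5) or 6
right = number == 5 or number == 6    # two complete comparisons
also_right = number in (5, 6)         # membership test, same meaning

print(wrong)         # 6
print(bool(wrong))   # True
print(right)         # False
print(also_right)    # False
```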

1.4 Built-in Data Structures and Sequences¶

  • List
  • Tuple
  • dict
  • set

1.4.1 List¶

A list is a sequential collection of Python data values, where each value is identified by an index. The values that make up a list are called its elements. There are several ways to create a new list. The simplest is to enclose the elements in square brackets [ and ]

In [1]:
a_list = [10, 20, 30, 40]
b_list = ["spam", "bungee", "swallow","hello","world"]

Indexing

In [2]:
a_list[1]
Out[2]:
20

Slicing

In [61]:
b_list[1:3]
Out[61]:
['bungee', 'swallow']
In [3]:
b_list[2:]
Out[3]:
['swallow', 'hello', 'world']
In [7]:
dir(b_list)
Out[7]:
['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']
| Method | Description |
|---|---|
| .append() | Adds an element at the end of the list |
| .pop() | Removes and returns the element at the specified position (the last element by default) |
| .remove() | Removes the first item with the specified value |
| .index() | Returns the index of the first element with the specified value |
| .extend() | Adds the elements of a list (or any iterable) to the end of the current list |
| .insert() | Adds an element at the specified position |
| .count() | Returns the number of elements with the specified value |
| .reverse() | Reverses the order of the list in place |
| .sort() | Sorts the list in place |
| .clear() | Removes all the elements from the list |
In [9]:
a_list = [10, 20, 30, 40]
In [10]:
a_list.append(80)
In [13]:
a_list
Out[13]:
[10, 20, 30, 40, 80]
In [14]:
a_list.pop()
Out[14]:
80
In [15]:
a_list
Out[15]:
[10, 20, 30, 40]
In [16]:
a_list.insert(1,80)
In [17]:
a_list
Out[17]:
[10, 80, 20, 30, 40]
In [18]:
a_list.index(30)
Out[18]:
3
In [77]:
a_list.extend(b_list)
In [78]:
a_list
Out[78]:
[10, 80, 20, 30, 40, 'spam', 'bungee', 'swallow', 'hello', 'world']
In [79]:
a_list.count(80)
Out[79]:
1
In [ ]:
a_list
In [ ]:
a_list[-4:]
In [ ]:
a_list
In [ ]:
# Assign value to element in list
a_list[5] = 0
In [ ]:
a_list
In [21]:
a_list = [50,10,30]
In [22]:
a_list[1] = 5
In [23]:
a_list
Out[23]:
[50, 5, 30]
In [24]:
string_a = "hello"
In [26]:
string_a[1] = "w"
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [26], in <cell line: 1>()
----> 1 string_a[1] = "w"

TypeError: 'str' object does not support item assignment
In [27]:
a_list
Out[27]:
[50, 5, 30]
In [28]:
a_list.sort()
In [30]:
a_list = [50,10,30]
In [34]:
sorted(a_list, reverse= True)
Out[34]:
[50, 30, 10]
In [33]:
help(sorted)
Help on built-in function sorted in module builtins:

sorted(iterable, /, *, key=None, reverse=False)
    Return a new list containing all items from the iterable in ascending order.
    
    A custom key function can be supplied to customize the sort order, and the
    reverse flag can be set to request the result in descending order.

In [32]:
a_list
Out[32]:
[50, 10, 30]
In [85]:
help(a_list.reverse)
Help on built-in function reverse:

reverse() method of builtins.list instance
    Reverse *IN PLACE*.
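
The cells above mix the built-in sorted function with the sort and reverse methods. As a sketch of the difference, with an illustrative list named prices: sorted returns a new list and leaves the original untouched, while sort and reverse change the list in place and return None.

```python
prices = [50, 10, 30]

descending = sorted(prices, reverse=True)   # builds a new list
print(descending)    # [50, 30, 10]
print(prices)        # [50, 10, 30]

returned = prices.sort()    # sorts in place and returns None
print(prices)        # [10, 30, 50]
print(returned)      # None

prices.reverse()            # reverses in place
print(prices)        # [50, 30, 10]
```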

In [26]:
mystring = "python is the best language for data analyst"
lst_str = mystring.split()
new_lst = lst_str[:5]
new_lst
Out[26]:
['python', 'is', 'the', 'best', 'language']
In [27]:
new_lst.sort()
print(new_lst)
['best', 'is', 'language', 'python', 'the']
QUIZ: What will the output be for the following code?

```python
mystring = "python is the best language for data analyst"
lst_str = mystring.split()
new_lst = lst_str[:4]
new_lst.sort()
print(new_lst)
```

Answer A (the code prints ['best', 'is', 'python', 'the'])

1.4.2 Tuple¶

A tuple, like a list, is a sequence of items of any type. The printed representation of a tuple is a comma-separated sequence of values, enclosed in parentheses. Tuple is a fixed-length, immutable sequence of Python objects.

In [ ]:
tup = ("hello", "hello", 67, "Dup", 20, "Actress", "Geo")
| Method | Description |
|---|---|
| .count() | Returns the number of times a specified value occurs in a tuple |
| .index() | Searches the tuple for a specified value and returns the position of where it was found |
In [ ]:
tup.count("hello")
In [ ]:
tup.index(20)
In [ ]:
tup[1] = 30  # raises TypeError: tuples are immutable

1.4.3 dict¶

It is a flexibly sized collection of key-value pairs, where key and value are Python objects. One approach for creating one is to use curly braces {} and colons to separate keys and values

In [35]:
d1 = {'a' : 'some value', 'b' : [1, 2, 3, 4]}
In [36]:
d1["a"]
Out[36]:
'some value'
In [37]:
d1["b"]
Out[37]:
[1, 2, 3, 4]
| Method | Description |
|---|---|
| .keys() | Returns a view of the dictionary's keys |
| .items() | Returns a view containing a (key, value) tuple for each key-value pair |
| .values() | Returns a view of all the values in the dictionary |
| .get() | Returns the value of the specified key, or a default value (None unless specified) if the key does not exist |
| .update() | Updates the dictionary with the specified key-value pairs |
| .setdefault() | Returns the value of the specified key. If the key does not exist, inserts the key with the specified value |
| .pop() | Removes the element with the specified key and returns its value |
| .popitem() | Removes and returns the last inserted key-value pair |
In [38]:
d1.keys()
Out[38]:
dict_keys(['a', 'b'])
In [39]:
d1.items()
Out[39]:
dict_items([('a', 'some value'), ('b', [1, 2, 3, 4])])
In [40]:
d1.values()
Out[40]:
dict_values(['some value', [1, 2, 3, 4]])
In [41]:
d1["c"]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Input In [41], in <cell line: 1>()
----> 1 d1["c"]

KeyError: 'c'
In [45]:
d1.get("c",0)
Out[45]:
0
In [46]:
d1
Out[46]:
{'a': 'some value', 'b': [1, 2, 3, 4]}
In [44]:
d1.get("b",5)
Out[44]:
[1, 2, 3, 4]
In [ ]:
d1.setdefault("b",0)
In [47]:
d1.setdefault("c",0)
Out[47]:
0
In [48]:
d1
Out[48]:
{'a': 'some value', 'b': [1, 2, 3, 4], 'c': 0}
In [55]:
e = {"e":80}
d1.update(e)
In [56]:
d1
Out[56]:
{'a': 'some value', 'b': [1, 2, 3, 4], 'c': 0, 'e': 80}
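
The difference between get and setdefault is easy to miss: get never changes the dictionary, while setdefault inserts the key when it is missing. A minimal sketch with a hypothetical inventory dictionary, which also exercises pop and popitem from the table:

```python
inventory = {"apple": 3, "banana": 0}

missing = inventory.get("cherry")           # None, dictionary unchanged
default = inventory.get("cherry", 0)        # 0, dictionary unchanged
size_after_get = len(inventory)             # still 2

added = inventory.setdefault("cherry", 5)   # key missing: inserts cherry -> 5
kept = inventory.setdefault("apple", 99)    # key exists: returns 3, no change

removed = inventory.pop("banana")           # removes the key, returns 0
last_key, last_value = inventory.popitem()  # removes the last inserted pair

print(inventory)             # {'apple': 3}
print(last_key, last_value)  # cherry 5
```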

1.4.4 set¶

A set is an unordered collection of unique elements.

In [57]:
set([2, 2, 2, 1, 3, 3])
Out[57]:
{1, 2, 3}
In [58]:
{2,3,3,3,3,2,3,8,8}
Out[58]:
{2, 3, 8}
| Method | Alternative syntax | Description |
|---|---|---|
| a.add(x) | N/A | Add element x to the set a |
| a.clear() | N/A | Reset the set a to an empty state, discarding all of its elements |
| a.remove(x) | N/A | Remove element x from the set a, raising KeyError if x is not in a |
| a.pop() | N/A | Remove an arbitrary element from the set a, raising KeyError if the set is empty |
| a.union(b) | a \| b | All of the unique elements in a and b |
| a.update(b) | a \|= b | Set the contents of a to be the union of the elements in a and b |
| a.intersection(b) | a & b | All of the elements in both a and b |
| a.intersection_update(b) | a &= b | Set the contents of a to be the intersection of the elements in a and b |
| a.difference(b) | a - b | The elements in a that are not in b |
| a.difference_update(b) | a -= b | Set a to the elements in a that are not in b |
| a.symmetric_difference(b) | a ^ b | All of the elements in either a or b but not both |
| a.symmetric_difference_update(b) | a ^= b | Set a to contain the elements in either a or b but not both |
| a.issubset(b) | a <= b | True if the elements of a are all contained in b |
| a.issuperset(b) | a >= b | True if the elements of b are all contained in a |
| a.isdisjoint(b) | N/A | True if a and b have no elements in common |
In [ ]:
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}
In [ ]:
a.union(b)
In [ ]:
a | b
In [ ]:
a.intersection(b)
In [ ]:
a & b
In [ ]:
# Sets are equal if and only if their contents are equal:
{1, 2, 3} == {3, 2, 1}  
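
The cells above were left without output, so the sketch below lists the results for the same two sets. The result names (union, common and so on) are illustrative only.

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

union = a | b          # same as a.union(b)
common = a & b         # same as a.intersection(b)
only_a = a - b         # same as a.difference(b)
either = a ^ b         # same as a.symmetric_difference(b)

print(union == {1, 2, 3, 4, 5, 6, 7, 8})   # True
print(common == {3, 4, 5})                 # True
print(only_a == {1, 2})                    # True
print(either == {1, 2, 6, 7, 8})           # True
print({3, 4}.issubset(a))                  # True
```

The results are compared with == instead of being printed directly, because the display order of set elements is not guaranteed.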
Comparison between list, tuple, set and dictionary

| Data type | List | Tuple | Set | Dictionary |
|---|---|---|---|---|
| Example | ['apple', 'banana', 'orange'] | ('Book 1', 12.99, 35, 100) | {10, 20, 12, 12.5, 'Book'} | {'name': 'Joe', 'age': 10} |
| Mutable? | Mutable | Immutable | Mutable | Mutable |
| Iterable? | Yes ~ O(n) | Yes ~ O(n) | Yes ~ O(n) | Yes ~ O(n) |
| Use case | Data that needs to change | Immutable data | Unique items | Key/Value pairs |

1.5 Type casting¶

1.5.1 Casting between integer and float (int and float)¶

In [ ]:
# Init an integer
r = 1
print(r, type(r))

# Cast integer to float
t = float(r)
print(t, type(t))

# Cast float to integer
q = int(t)
print(q, type(q))
In [59]:
a = 30
In [60]:
float(a)
Out[60]:
30.0
In [65]:
a = "ab"
In [66]:
type(a)
Out[66]:
str
In [67]:
int(a)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [67], in <cell line: 1>()
----> 1 int(a)

ValueError: invalid literal for int() with base 10: 'ab'
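QUIZ: What will the output be for the following code?

```python
print(int(3.65))
```

Answer B (int() truncates the decimal part, so the code prints 3)

To round to the nearest value instead, use the function round(number, digits).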

1.5.2. Casting between string and integer (str and int) and between string and float (str and float)¶

In [ ]:
s = "3.14159"
In [ ]:
fval = float(s)
ival = int(fval)
In [ ]:
# Converting a float or an int to a string always works
str(fval), str(ival)
In [ ]:
int("3")
In [ ]:
int("hello")
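
Calling int on a string that is not a whole number raises a ValueError, as int("hello") does above. One way to handle this is sketched below; the helper name to_int and its default parameter are hypothetical and not part of the original notebook.

```python
def to_int(text, default=None):
    """Hypothetical helper: convert text to int, or return default on failure."""
    try:
        return int(text)
    except ValueError:
        return default

print(to_int("3"))               # 3
print(to_int("hello"))           # None
print(to_int("3.14159"))         # None, because int() does not parse decimal strings
print(int(float("3.14159")))     # 3, going through float first
print(str(3.5), str(42))         # 3.5 42
```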

1.5.3. Casting among list, tuple and set¶

List and Tuple¶

In [ ]:
lst = [1, 2.5, 'stringggg']

tup = tuple(lst)

lst1 = list(tup)

print(type(lst),type(tup),type(lst1))

List and Set¶

In [ ]:
# Init a list
w = [1, 2.5, 'stringggg',]
print(w, type(w))

# Cast list to set
z = set(w)
print(z, type(z))

# Cast set to list
x = list(z)
print(x, type(x))

Tuple and Set¶

In [ ]:
# Init a tuple
w = (1, 2.5, 'stringggg')
print(w, type(w))

# Cast tuple to set
z = set(w)
print(z, type(z))

# Cast set to tuple
x = tuple(z)
print(x, type(x))
For more details on the built-in types, see the Python documentation (the link here).
QUIZ: What will the output be for the following code?

```python
w = (1, 2.5, 'stringggg', 1, 'stringggg')
print(w, type(w))

# Cast tuple to set
z = set(w)
print(z, type(z))

# Cast set to tuple
x = tuple(z)
print(x, type(x))
```


Answer C

2. Control Flow statements in Python¶

Python has several built-in keywords for conditional logic, loops, and other standard control flow concepts found in other programming languages.

  • Conditional execution
  • for loops
  • while loops
  • pass
  • range

2.1. Conditional execution¶

We almost always need the ability to check conditions and change the behavior of the program accordingly

2.1.1 If statement with an else clause¶

if condition:
    STATEMENTS_1
else:
    STATEMENTS_2
In [68]:
x = 20
y = 10
In [70]:
if x < y:
    print("x < y")
else:
    print("x >= y")  # the else branch also covers x == y
    
print("Hello world")
x >= y
Hello world

[Figure: if-else]

2.1.2 Omitting the else clause¶

if condition:
    STATEMENTS
In [71]:
if x < y:
    print("x <y")
    
print("Hello world")
Hello world

[Figure: if without else]

2.1.3 Chained conditional¶

if condition1:
    STATEMENTS_A
elif condition2:
    STATEMENTS_B
else:
    STATEMENTS_C
In [ ]:
x = 10
y = 10

if x < y:
    print("x is less than y")
elif x > y:
    print("x is greater than y")

else:
    print("x and y must be equal")
In [ ]:
grade = 85

if grade> 95:
    print("top")
elif grade >80:
    print ("better")
elif grade > 60:
    print ("good")
else: 
    print ("fail")
    
print("end block")

[Figure: Chained conditional]

2.1.4 Nested conditional¶

One conditional can also be nested within another.

if condition1:
    STATEMENTS_A
else:
    if condition2:
        STATEMENTS_B
    else:
        STATEMENTS_C
In [ ]:
if condition1:
    STATEMENTS_A
elif condition2:
    if condition3:
        STATEMENTS_B
    else:
        STATEMENTS_C
else:
    if condition4:
        STATEMENTS_D
    else:
        STATEMENTS_E
    
In [ ]:
if condition1:
    STATEMENTS_A
elif condition2:
    STATEMENTS_B
else:
    STATEMENTS_C
In [ ]:
x = 11
In [ ]:
if 0 < x: # Assume x is an int here
    if x < 10:
        print("x is a positive single digit.")

Instead of two nested if statements, each with a simple condition, we can build a single, more complex condition with the and operator. Now only one if statement is needed:

In [ ]:
if 0 < x and x < 10:
    print("x is a positive single digit.")
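
Python also lets comparisons be chained, which gives a third way to write the same test. The helper name below is hypothetical and only serves this sketch.

```python
def is_positive_single_digit(x):
    # Chained comparison: equivalent to (0 < x) and (x < 10).
    return 0 < x < 10

checks = [is_positive_single_digit(v) for v in (5, 11, 0)]
print(checks)
```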

[Figure: Nested conditionals]

2.2 The for loops¶

2.2.1 Traversal and the for Loop¶

for loops are for iterating over a collection (like a list or tuple) or an iterator. The standard syntax for a for loop is:

for value in collection:
    # do something with value

[Figure: for loop]

In [12]:
num_loop = 1

for i in [1,2,3]:
    print("*"*6)
    print("value i = {}".format(i))
    print("Loop number {}".format(num_loop))
    print("_"*6)
    num_loop = num_loop + 1    
  
    
print("--End loop FOR --")
******
value i = 1
Loop number 1
______
******
value i = 2
Loop number 2
______
******
value i = 3
Loop number 3
______
--End loop FOR --
In [22]:
num_lst = [80, 30, 100, 90]
max_value = 0  # starting at 0 only works when at least one number is >= 0
for num in num_lst:
    if num > max_value:
        max_value = num
max_value
Out[22]:
100
In [19]:
num_lst = [80, 30, 90,100]

max_value = num_lst[0]


for num in num_lst:
    print("*"*6)
    print("value of max_value: {}; value of num: {}".format(max_value, num))
       
    if num > max_value:
        print("The value of max_value: {}".format(max_value))
        max_value = num
        print("Then the value of max_value changes to: {}".format(max_value))
    print("_"*6)
******
value of max_value: 80; value of num: 80
______
******
value of max_value: 80; value of num: 30
______
******
value of max_value: 80; value of num: 90
The value of max_value: 80
Then the value of max_value changes to: 90
______
******
value of max_value: 90; value of num: 100
The value of max_value: 90
Then the value of max_value changes to: 100
______
In [75]:
max_value
Out[75]:
100
In [ ]:
num_lst = [80, 30, 100, 90]


max_value = num_lst[0]


for num in num_lst:
    print("max_value is {}; num is {}".format(max_value, num))
    print("--------")
    
    if num > max_value:
        max_value = num
        print("max_value is {}".format(max_value))
QUIZ: Find min value in list
    lst = [3, 8, 200, 350, -30]
    
Answer
lst = [3, 8, 200, 350, -30]
min_value = lst[0]
for num in lst:
    if num < min_value:
        min_value = num
min_value
        

QUIZ:

Optional: Find max value in dictionary

    dic_num = {"d": 90 , "t": 30, "n": 50, "r": 10}
    
Hint: use .keys, for, list

Solution

dic_num = {"d": 90, "t": 30, "n": 50, "r": 10}
keys_lst = list(dic_num.keys())
max_key = keys_lst[0]

for key in keys_lst:
    if dic_num[key] > dic_num[max_key]:
        max_key = key
max_value = dic_num[max_key]
max_value

2.2.2. The Accumulator Pattern¶

One common programming “pattern” is to traverse a sequence, accumulating a value as we go, such as the sum-so-far or the maximum-so-far.

The anatomy of the accumulation pattern includes:

  • initializing an “accumulator” variable to an initial value (such as 0 if accumulating a sum)
  • iterating (e.g., traversing the items in a sequence)
  • updating the accumulator variable on each iteration (i.e., when processing each item in the sequence)
In [81]:
nums = [1, 2, 3, 4] # 5, 6, 7, 8, 9, 10
accum = 0
for w in nums:
    accum += w # equivalent accum = accum + w
In [82]:
accum 
Out[82]:
10

A for loop can be exited altogether with the break keyword

In [83]:
sequence = [1, 2, 0, 4, 6, 5, 2, 1]
total_until_5 = 0
for value in sequence:
    if value == 5:
        break
    total_until_5 += value
In [84]:
total_until_5
Out[84]:
13
QUIZ: Use a for loop and a dict to count the occurrences of each word in the given string, ignoring case
my_str = "Mango banAna apple pear bananA grapes strawberry blueberry apPle blueberry KiWi apple blueberry manGo strawberry"
Answer Method 1
my_str = "Mango banAna apple pear bananA grapes strawberry blueberry apPle blueberry KiWi apple blueberry manGo strawberry"
d = {}
lst_words = my_str.split()
for word in lst_words:
    w_lower = word.lower()
    if w_lower in d:
        d[w_lower] = d[w_lower] + 1
    else:
        d[w_lower] = 1   
d
Method 2
my_str = "Mango banAna apple pear bananA grapes strawberry blueberry apPle blueberry KiWi apple blueberry manGo strawberry"
d = {}
lst_words = my_str.split()
for word in lst_words:
    w_lower = word.lower()
    d[w_lower] = d.get(w_lower,0) + 1
d
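
As a point of comparison, the standard library offers collections.Counter, which implements the same accumulation. This is an additional variant, not one of the quiz's intended answers; the name counts is chosen here.

```python
from collections import Counter

my_str = "Mango banAna apple pear bananA grapes strawberry blueberry apPle blueberry KiWi apple blueberry manGo strawberry"

# Counter builds the same word -> count mapping in one step.
counts = Counter(word.lower() for word in my_str.split())

print(counts)
print(counts["apple"])
```

Methods 1 and 2 remain worth knowing, since they show the accumulator pattern that Counter hides.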

2.2.3 Nested Iteration¶

Nested iteration simply means that we will place one iteration construct inside of another. We will call these two iterations the outer iteration and the inner iteration.

In [86]:
for i in range(5):
    for j in range(3):
        print(i, j)
0 0
0 1
0 2
1 0
1 1
1 2
2 0
2 1
2 2
3 0
3 1
3 2
4 0
4 1
4 2
In [85]:
range(5)
Out[85]:
range(0, 5)

2.3. while loops¶

A while loop specifies a condition and a block of code that is to be executed until the condition evaluates to False or the loop is explicitly ended with break:

while condition:
    # do something
In [6]:
x = 10
total = 0
while total <x :
    total += 0.3235
In [7]:
total
Out[7]:
10.028499999999998
In [89]:
x = 256
total = 0
while (x > 0) and (total < 500):
#     if total > 500:
#         break
    total += x
    x = x -100
In [90]:
total
Out[90]:
468
In [88]:
x = 256
total = 0
while x > 0:
    total += x
    x = x + 100
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Input In [88], in <cell line: 3>()
      3 while x > 0:
      4     total += x
----> 5     x = x + 100

KeyboardInterrupt: 
In [ ]:
x = 5

while x < 26:
    x += 5 # x = x + 5
    print (x)
Choosing between for and while

Use a for loop if you know, before you start looping, the maximum number of times that you’ll need to execute the body. For example, if you’re traversing a list of elements, you know that the maximum number of loop iterations you can possibly need is “all the elements in the list”. Or if you need to print the 12 times table, you know right away how many times the loop will need to run.

So any problem like “iterate this weather model for 1000 cycles”, “search this list of words”, or “find all prime numbers up to 10000” suggests that a for loop is best.

By contrast, if you are required to repeat some computation until some condition is met, and you cannot calculate in advance when (or if) this will happen, you’ll need a while loop.

We call the first case definite iteration — we know ahead of time some definite bounds for what is needed. The latter case is called indefinite iteration — we’re not sure how many iterations we’ll need — we cannot even establish an upper bound!
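
The two cases can be put side by side in a short sketch. The halving rule used for the while loop is only an illustrative assumption, picked because the number of steps is hard to predict from the starting value.

```python
# Definite iteration: we know up front that the body runs exactly 12 times.
times_table = []
for k in range(1, 13):
    times_table.append(7 * k)

# Indefinite iteration: the number of passes is not known in advance.
# Illustrative rule: halve even numbers, map odd n to 3n + 1, stop at 1.
n = 6
steps = 0
while n != 1:
    if n % 2 == 0:
        n = n // 2
    else:
        n = 3 * n + 1
    steps += 1

print(times_table)
print(steps)
```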

2.4. Pass & Range¶

  • pass
  • range
  • enumerate
  • sorted
  • zip

pass

pass is the “no-op” statement in Python. It can be used in blocks where no action is to be taken, or as a placeholder for code that has not been written yet.

In [8]:
x = 0
if x < 0:
    print('negative!')
elif x == 0:
    # TODO: put something smart here
    pass
else:
    print('positive!')

range

The range function returns a range object, a lazy sequence of evenly spaced integers that produces its values on demand.

In [33]:
num_loop = 1

for i in [1,2,3]:
    print("*"*6)
    print("value i = {}".format(i))
    print("Loop number {}".format(num_loop))
    print("_"*6)
    num_loop = num_loop + 1
******
value i = 1
Loop number 1
______
******
value i = 2
Loop number 2
______
******
value i = 3
Loop number 3
______
In [32]:
num_loop = 1

for i in range(1,4):
    print("The value of i = {}".format(i))
    print("Number loop: {}".format(num_loop))
    num_loop += 1  
The value of i = 1
Number loop: 1
The value of i = 2
Number loop: 2
The value of i = 3
Number loop: 3
In [14]:
list(range(1,4))
Out[14]:
[1, 2, 3]
In [ ]:
range(0, 10)
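
"Evenly spaced" becomes clearer once the optional third argument, the step, is used. A brief sketch with illustrative variable names:

```python
# range(stop): starts at 0 and stops before `stop`.
first_five = list(range(5))            # [0, 1, 2, 3, 4]

# range(start, stop, step): the step sets the spacing.
by_fives = list(range(0, 20, 5))       # [0, 5, 10, 15]

# A negative step counts down; `stop` is still excluded.
countdown = list(range(5, 0, -1))      # [5, 4, 3, 2, 1]

print(first_five, by_fives, countdown)
```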

zip

zip “pairs” up the elements of a number of lists, tuples, or other sequences and returns an iterator of tuples; wrap it in list() to get a list of tuples.

In [18]:
seq1 = ['foo', 'bar', 'baz']
seq2 = ['one', 'two', 'three']
zipped = zip(seq1, seq2)
zipped
Out[18]:
<zip at 0x2271136b300>
In [19]:
for i in zipped:
    print(i)
    print(type(i))
('foo', 'one')
<class 'tuple'>
('bar', 'two')
<class 'tuple'>
('baz', 'three')
<class 'tuple'>
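
Three properties of zip are easy to miss and are sketched below: it stops at the shortest input, it combines well with dict(), and the zip object can be consumed only once. seq3 is an extra list introduced for this illustration.

```python
seq1 = ['foo', 'bar', 'baz']
seq2 = ['one', 'two', 'three']
seq3 = [False, True]                      # deliberately shorter

# zip stops at the shortest input, so 'baz' / 'three' are dropped here.
triples = list(zip(seq1, seq2, seq3))

# A common use: build a dict from a sequence of keys and one of values.
mapping = dict(zip(seq1, seq2))

# A zip object is an iterator: it can be consumed only once.
zipped = zip(seq1, seq2)
first_pass = list(zipped)
second_pass = list(zipped)                # empty, nothing is left

print(triples)
print(mapping)
print(len(first_pass), len(second_pass))
```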

enumerate

enumerate returns an iterator of (i, value) tuples, where i is the index of value in the collection.

for index, value in enumerate(collection):
    # do something with value
In [20]:
a_list = [1,2,3]
for index, value in enumerate(a_list):
    print(index,"--",value)
0 -- 1
1 -- 2
2 -- 3

sorted

sorted returns a new sorted list from the elements of any iterable and leaves the original unchanged; by contrast, the list method sort sorts the list in place.

sorted(iterable, key=None, reverse=False)
In [26]:
lst = [80,60,30,5]
sorted(lst)
Out[26]:
[5, 30, 60, 80]
In [23]:
lst
Out[23]:
[80, 60, 30, 5]
In [24]:
lst.sort()
In [25]:
lst
Out[25]:
[5, 30, 60, 80]
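
The key and reverse parameters from the signature above are not exercised in the cells, so here is a small sketch of both (the sample lists are made up for illustration):

```python
words = ['python', 'a', 'dove', 'as']

# key: a function applied to each element to get the value to sort by.
by_length = sorted(words, key=len)              # shortest first

# reverse=True sorts in descending order.
descending = sorted([80, 60, 30, 5, 70], reverse=True)

# sorted() returns a new list; list.sort() works in place and returns None.
nums = [3, 1, 2]
returned = nums.sort()

print(by_length, descending, nums, returned)
```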

2.5 List, Set, and Dict Comprehensions¶

List comprehensions are one of the most-loved Python language features. They allow you to concisely form a new list by filtering the elements of a collection, transforming the elements passing the filter in one concise expression.

With if

[expr for val in collection if condition]

Equivalent

result = []
for val in collection:
    if condition:
        # expr is any expression built from val
        result.append(expr)

In general, with if&else

[expr1 if condition else expr2 for value in collection]
In [27]:
# Add 5 to each number in lst_nums
lst_nums = [80,60,30,5]

result = []

for num in lst_nums:
    value = num + 5
    result.append(value)
    
result
Out[27]:
[85, 65, 35, 10]
In [28]:
lst_nums = [80,60,30,5]
[num + 5 for num in lst_nums]
Out[28]:
[85, 65, 35, 10]
In [40]:
# Add 5 to each number below 50; keep the other numbers unchanged
lst_nums = [80,60,30,5]
[num + 5 if num <50 else num for num in lst_nums]
Out[40]:
[80, 60, 35, 10]
QUIZ: Write a list comprehension equivalent to the following code
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']
result = []
for item in strings:
    if len(item)>2:
        result.append(item.upper())
result
    
Answer
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']
[x.upper() for x in strings if len(x) > 2]
        

Set and dict comprehensions

dict_comp = {key-expr : value-expr for value in collection if condition}


set_comp = {expr for value in collection if condition}
In [42]:
lst = ["a","b","c","d"]
dic_num = {val: i for i,val in enumerate(lst)}
dic_num
Out[42]:
{'a': 0, 'b': 1, 'c': 2, 'd': 3}
In [44]:
lst = ["a","b","c","d","a","c"]
set_comp = {val for val in lst}
set_comp
Out[44]:
{'a', 'b', 'c', 'd'}

Nested list comprehensions

In [34]:
all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'],
            ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

result = [name for names in all_data for name in names if name.count('e') >= 2]
result
Out[34]:
['Steven']
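
The order of the for clauses in a nested comprehension is easiest to see when it is unrolled into ordinary loops. The sketch below reuses all_data from the cell above; flattened is an extra illustration.

```python
all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'],
            ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

# The for clauses of a nested comprehension read left to right,
# in the same order as the equivalent nested for loops.
result = []
for names in all_data:            # outer loop: first for clause
    for name in names:            # inner loop: second for clause
        if name.count('e') >= 2:  # filter: trailing if clause
            result.append(name)

# The same pattern without a filter flattens a list of lists.
flattened = [name for names in all_data for name in names]

print(result, len(flattened))
```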

3. Functions and Files¶

3.1 Functions¶

  • Function Definition
  • Function Parameters
  • Returning a value from a function
  • Variables, parameters are local; global variables
  • Functions can call other functions
  • Flow of execution
  • Anonymous (Lambda) Functions
  • Generators

DRY: Don’t Repeat Yourself

In [38]:
def add_5 (lst_nums):
    return [num + 5 for num in lst_nums]
In [37]:
lst_nums = [80,60,30,5]
add_5(lst_nums)
Out[37]:
[85, 65, 35, 10]
In [39]:
lst_nums_b = [15,35,25,10]
add_5(lst_nums_b)
Out[39]:
[20, 40, 30, 15]

3.1.1 Function Definition¶

A function is a named sequence of statements that belong together. The primary purpose of functions is to help us organize programs into chunks that match how we think about the problem.

There are two kinds of functions in Python:

  • Built-in functions that are provided as part of Python: print(), input(), type(), float(), int() ...
  • Functions that we define ourselves and then use.

The syntax for a function definition is:

def NAME (PARAMETERS):
    STATEMENTS

Defining a new function does not make the function run. To execute the function, we need a function call (also called a function invocation). The way to invoke a function is to refer to it by name, followed by parentheses.

In [40]:
def greet ():
    print("Hello, Quang")
    print("Nice to meet you")

print("Hello")
Hello
In [41]:
greet()
Hello, Quang
Nice to meet you
In [42]:
for i in range(5):
    greet()
    print("-----")
Hello, Quang
Nice to meet you
-----
Hello, Quang
Nice to meet you
-----
Hello, Quang
Nice to meet you
-----
Hello, Quang
Nice to meet you
-----
Hello, Quang
Nice to meet you
-----

3.1.2 Function Parameters¶

In [53]:
def add_value (lst_nums,value):
    return [num + value for num in lst_nums]
In [55]:
lst_nums = [80,60,30,5]
add_value(lst_nums,10)
Out[55]:
[90, 70, 40, 15]

With parameters, functions are even more powerful, because they can do pretty much the same thing on each invocation, but not exactly the same thing.

  • Formal parameters (or parameter names) are the names listed in the function definition.
  • Arguments (or actual parameters, or parameter values) are the pieces of information passed to the function when it is called, which it needs to do its work. If a function has more than one parameter, they are separated by commas.
In [45]:
def greet (name):
    print("Hello, " + name)
    print("Nice to meet you")

greet("Nam")
Hello, Nam
Nice to meet you

3.1.3 Returning a value from a function¶

Not only can you pass a parameter value into a function, a function can also produce a value.

  • Functions that return values are sometimes called fruitful functions.
    • The return statement is followed by an expression which is evaluated. Its result is returned to the caller as the “fruit” of calling this function
  • A function that doesn’t return a value is called a void (non-fruitful) function; in Python such a function implicitly returns None.

A return statement, once executed, immediately terminates execution of a function, even if it is not the last statement in the function.

In [46]:
def square(number):
    return number*number
    print("Hello world")  # never runs: return has already ended the function

square (8)
Out[46]:
64
In [56]:
def add_value (lst_nums,value=5):
    return [num + value for num in lst_nums]
In [58]:
lst_nums = [80,60,30,5]
add_value(lst_nums,3)
Out[58]:
[83, 63, 33, 8]
In [59]:
def square(number = 3):
    return number*number
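
The value=5 and number=3 parts of the definitions above are default values: they are used whenever the caller leaves that argument out. A short sketch of the different ways to call such functions (result names are illustrative):

```python
def square(number=3):
    return number * number

def add_value(lst_nums, value=5):
    return [num + value for num in lst_nums]

default_square = square()                    # no argument: number defaults to 3
explicit_square = square(4)                  # a positional argument overrides it
default_add = add_value([80, 60])            # value defaults to 5
keyword_add = add_value([80, 60], value=1)   # arguments can also be passed by name

print(default_square, explicit_square, default_add, keyword_add)
```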

3.1.3 Variables, parameters are local; global variables¶

Each call of the function creates new local variables, and their lifetimes expire when the function returns to the caller.

  • Local variables and parameters exist only inside the function; you cannot use them outside it.
  • Variables defined at the top level, not inside any function definition, are called global variables.
In [62]:
def add_five(num):
    add_num = 5
    result = num + add_num
    return result

a = add_five(5)
add_num = 8
print(a)
print(a,"&", add_num)
a = add_five(5)
print(a)
10
10 & 8
10

3.1.3 Functions can call other functions¶

Calling functions from other functions is one of the most important ways programmers take a large problem and break it down into a group of smaller problems.

  • Each of the functions we write can be used and called from other functions we write.
  • Functional decomposition is the process of breaking a problem into smaller subproblems.
In [63]:
def square(x):
    y = x * x
    return y


def sum_of_squares(x,y,z):
    a = square(x)
    b = square(y)
    c = square(z)
    return a + b + c
In [64]:
sum_of_squares(1,2,3)
Out[64]:
14

3.1.4 Flow of execution¶

When you are working with functions it is really important to know the order in which statements are executed.

  • Statements are executed one at a time, in order, from top to bottom.
  • Statements inside the function are not executed until the function is called.
  • A function call is a detour in the flow of execution: instead of going to the next statement, the flow jumps to the first line of the called function, executes all the statements there, and then comes back to pick up where it left off.
In [66]:
def add_five(num):
    add_num = 5
    result = num + add_num
    return result
    print("hello")


for i in range(5):
    print(i)
add_five(5)
0
1
2
3
4
Out[66]:
10
QUIZ: Write a function

Write a function named number_test that takes a number as input. If the number is greater than 8, the function should return “Greater than 8.” If the number is less than 8, the function should return “Less than 8.” If the number is equal to 8, the function should return “Equal to 8.”

Answer
def number_test(num):
    if num> 8:
        return "Greater than 8."
    elif num == 8:
        return "Equal to 8."
    else:
        return "Less than 8."
    

3.1.5 Anonymous (Lambda) Functions¶

The syntax for a lambda function:

lambda arguments: expression
  • The syntax of a lambda expression is the word “lambda” followed by parameter names, separated by commas but not inside (parentheses), followed by a colon and then an expression
  • A function, whether named or anonymous, can be called by placing parentheses () after it. In this case, because there is one parameter, there is one value in parentheses. This works the same way for the named function and the anonymous function produced by the lambda expression. The lambda expression had to go in parentheses just for the purposes of grouping all its contents together.
In [67]:
def add_five(num):
    add_num = 5
    result = num + add_num
    return result

add_five(5)
Out[67]:
10
In [68]:
a = lambda x : x+5
In [69]:
a(8)
Out[69]:
13
In [70]:
add_five(8)
Out[70]:
13
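
The points about parentheses and multiple parameters can be made concrete with a small sketch. The names add and by_last_letter are chosen for illustration only.

```python
# Calling a lambda directly: the parentheses around the lambda group it,
# and the second pair of parentheses calls it with the argument 8.
direct = (lambda x: x + 5)(8)

# Several parameters are separated by commas, without parentheses.
add = lambda x, y: x + y
total = add(2, 3)

# Lambdas are handy as short throwaway arguments, for example a sort key.
words = ['python', 'a', 'dove', 'as']
by_last_letter = sorted(words, key=lambda w: w[-1])

print(direct, total, by_last_letter)
```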
QUIZ: Write a lambda function

Write a lambda function that takes a string as input and then returns the uppercase first character of the string

Answer
a = lambda string : string[0].upper()
    

3.1.6 Currying: Partial Argument Application¶

Currying means deriving new functions from existing ones by partial argument application.

In [ ]:
def add_numbers(x, y):
    return x + y
In [ ]:
add_five = lambda y : add_numbers(5,y)
In [ ]:
add_five(6)
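
The standard library's functools.partial does the same job as the lambda above. A minimal sketch, reusing add_numbers:

```python
from functools import partial

def add_numbers(x, y):
    return x + y

# partial fixes x=5 and returns a new callable that still expects y.
add_five = partial(add_numbers, 5)

print(add_five(6))
```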

3.1.7 Generators¶

  • An iterator is any object that will yield objects to the Python interpreter when used in a context like a for loop.
  • A generator is a concise way to construct a new iterable object. Generators return a sequence of multiple results lazily, pausing after each one until the next one is requested. To create a generator, use the yield keyword instead of return in a function
In [45]:
def squares(n=10):
    print('Generating squares from 1 to {0}'.format(n ** 2))
    for i in range(1, n + 1):
        yield i ** 2
In [46]:
gen = squares()
In [47]:
gen
Out[47]:
<generator object squares at 0x0000027908F33200>
In [48]:
list(gen)
Generating squares from 1 to 100
Out[48]:
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Another even more concise way to make a generator is by using a generator expression.

In [ ]:
gen = (x ** 2 for x in range(100))
In [ ]:
gen
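
Laziness can be observed by pulling values out one at a time with the built-in next(). The sketch below uses the same generator expression; the variable names are illustrative.

```python
gen = (x ** 2 for x in range(100))

# next() asks for one value at a time; nothing is computed before that.
first = next(gen)      # 0
second = next(gen)     # 1

# A generator is consumed as it is read: the rest starts at 2 ** 2.
remaining = sum(gen)

# Generator expressions can be passed straight to functions such as sum().
total = sum(x ** 2 for x in range(100))

print(first, second, remaining, total)
```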

3.2 Files and the Operating System¶

Absolute vs. Relative Paths

There are two ways to specify a file path:
  • An absolute path, which always begins with the root folder
  • A relative path, which is relative to the program’s current working directory (cwd)

There are also the dot (.) and dot-dot (..) folders. These are not real folders but special names that you can use in a path. A single period (.) for a folder name is shorthand for “this directory.” Two periods (..) means “the parent folder.”

The figure shows an example of some folders and files. When the cwd is set to C:\bacon, the relative paths for the other folders and files are as shown in the figure.

The .\ at the start of a relative path is optional. For example, .\spam.txt and spam.txt refer to the same file.
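
These ideas can be tried out with the standard os module. The sketch below only builds path strings, so spam.txt does not need to exist; the printed values depend on where the code is run.

```python
import os

# The current working directory (cwd) is the anchor for relative paths.
cwd = os.getcwd()

# abspath() turns a relative path into an absolute one by joining it to the cwd.
absolute = os.path.abspath('spam.txt')

# A leading "." (this directory) is optional and normalises away.
same_file = os.path.normpath(os.path.join('.', 'spam.txt'))

print(cwd)
print(absolute)
print(same_file)
```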

[Figure: absolute paths.png]

To open a file for reading or writing, use the built-in open function with either a relative or absolute file path. By default, the file is opened in read-only mode 'r'.

| Mode | Description |
|---|---|
| r | Read-only mode |
| w | Write-only mode; creates a new file (erasing the data for any file with the same name) |
| x | Write-only mode; creates a new file, but fails if the file path already exists |
| a | Append to existing file (create the file if it does not already exist) |
| r+ | Read and write |
| b | Add to mode for binary files (i.e., 'rb' or 'wb') |
| t | Text mode for files (automatically decoding bytes to Unicode). This is the default if not specified. Add t to other modes to use this (i.e., 'rt' or 'xt') |

Note: When you use open to create file objects, it is important to explicitly close the file when you are finished with it.

In [ ]:
path = 'data/travel_plans.txt'
fl = open(path)
In [ ]:
fl.readlines()
In [71]:
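# Absolute Windows path, for reference (write it as a raw string so that
# backslashes are not read as escape characters):
# r'D:\GitHub\sword\pylv1_mcivn\data\travel_plans.txt'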

path = r'data/travel_plans.txt'
fl = open(path)
for line in fl:
    print(line)
fl.close()
This summer I will be travelling.

I will go to...

Italy: Rome

Greece: Athens

England: London, Manchester

France: Paris, Nice, Lyon

Spain: Madrid, Barcelona, Granada

Austria: Vienna

I will probably not even want to come back!

However, I wonder how I will get by with all the different languages.

I only know English!
| Method | Use | Explanation |
|---|---|---|
| write | fl.write(astring) | Add a string to the end of the file. fl must refer to a file that has been opened for writing. |
| read(n) | fl.read() | Read and return a string of n characters, or the entire file as a single string if n is not provided. |
| readline(n) | fl.readline() | Read and return the next line of the file with all text up to and including the newline character. If n is provided as a parameter, then only n characters will be returned if the line is longer than n. |
| readlines(n) | fl.readlines() | Returns a list of strings, each representing a single line of the file. If n is not provided then all lines of the file are returned. If n is provided then n characters are read but n is rounded up so that an entire line is returned. |
In [72]:
path = r'data/travel_plans.txt'
fl = open(path)
fl.read()
Out[72]:
'This summer I will be travelling.\nI will go to...\nItaly: Rome\nGreece: Athens\nEngland: London, Manchester\nFrance: Paris, Nice, Lyon\nSpain: Madrid, Barcelona, Granada\nAustria: Vienna\nI will probably not even want to come back!\nHowever, I wonder how I will get by with all the different languages.\nI only know English!'
In [73]:
path = r'data/travel_plans.txt'
fl = open(path)
fl.readline()
Out[73]:
'This summer I will be travelling.\n'
In [74]:
path = r'data/travel_plans.txt'
fl = open(path)
fl.readlines()
Out[74]:
['This summer I will be travelling.\n',
 'I will go to...\n',
 'Italy: Rome\n',
 'Greece: Athens\n',
 'England: London, Manchester\n',
 'France: Paris, Nice, Lyon\n',
 'Spain: Madrid, Barcelona, Granada\n',
 'Austria: Vienna\n',
 'I will probably not even want to come back!\n',
 'However, I wonder how I will get by with all the different languages.\n',
 'I only know English!']
In [75]:
fl.write("Hello world")
---------------------------------------------------------------------------
UnsupportedOperation                      Traceback (most recent call last)
Input In [75], in <cell line: 1>()
----> 1 fl.write("Hello world")

UnsupportedOperation: not writable
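
The error above occurs because the file was opened in the default read-only mode. A minimal sketch of writing, under the assumption that a throwaway temporary folder is acceptable (notes.txt is a hypothetical file name, not part of the course data):

```python
import os
import tempfile

# A throwaway directory keeps this sketch from touching real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'notes.txt')   # hypothetical file name

    # 'w' creates the file (or erases an existing one) and allows write().
    with open(path, 'w') as fl:
        fl.write("Hello world\n")

    # 'a' appends to the end instead of erasing.
    with open(path, 'a') as fl:
        fl.write("Second line\n")

    # Default mode 'r': read the result back.
    with open(path) as fl:
        lines = fl.readlines()

print(lines)
```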

The Python with statement makes using context managers easy. The general form of a with statement is:

with <create some object that understands context> as <some name>:
    do some stuff with the object
    ...
In [78]:
path = r'data/travel_plans.txt'
with open(path) as fl:
    for line in fl.readlines():
        print(line)
This summer I will be travelling.

I will go to...

Italy: Rome

Greece: Athens

England: London, Manchester

France: Paris, Nice, Lyon

Spain: Madrid, Barcelona, Granada

Austria: Vienna

I will probably not even want to come back!

However, I wonder how I will get by with all the different languages.

I only know English!

3.2.1 Recipe for Reading and Processing a File¶

Here’s a foolproof recipe for processing the contents of a text file.

    1. Open the file using with and open.
    2. Use .readlines() to get a list of the lines of text in the file.
    3. Use a for loop to iterate through the strings in the list, each being one line from the file. On each iteration, process that line of text.
    4. When you are done extracting data from the file, continue writing your code outside of the indentation. Using with will automatically close the file once the program exits the with block.
fname = "yourfile.txt"
with open(fname, 'r') as fileref:         # step 1
    lines = fileref.readlines()           # step 2
    for line in lines:                    # step 3
        pass  # some code that references the variable line
# some other code not relying on fileref  # step 4

QUIZ:

Optional: Extract list of cities and list of countries in file

    path = 'data/travel_plans.txt'
    
*Hint: use str.split , str.strip , for, with open(path) as fl,fl.readlines, list.append, list.extend

lst_country = ['Italy', 'Greece', 'England', 'France', 'Spain', 'Austria']
lst_city = ['Rome', 'Athens', 'London', ' Manchester', 'Paris', ' Nice', ' Lyon', 'Madrid', ' Barcelona', ' Granada', 'Vienna']
    

Click to see solution!
  
path = 'data/travel_plans.txt'
with open(path) as fl:
    lst_country = []
    lst_city = []
    for line in fl.readlines():
        if ":" in line:
            a = line.split(":")
            lst_country.append(a[0])
            lst_city.extend(a[1].strip().split(","))
    


3.2.2 Bytes and Unicode with Files¶

The default behavior for Python files (whether readable or writable) is text mode, which means that you intend to work with Python strings (i.e., Unicode). If you find yourself regularly doing data analysis on non-ASCII text data, mastering Python’s Unicode functionality will prove valuable. See Python’s online documentation for much more.
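
The difference between text mode and binary mode shows up as soon as a non-ASCII character is involved. The sketch below is an illustration under assumptions: it writes to a temporary folder, greeting.txt is a hypothetical file name, and UTF-8 is chosen explicitly through open's encoding parameter.

```python
import os
import tempfile

text = "Xin ch\u00e0o"   # 8 characters, one of them non-ASCII

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'greeting.txt')   # hypothetical file name

    # Text mode: strings are encoded to bytes on the way out.
    with open(path, 'w', encoding='utf-8') as fl:
        fl.write(text)

    # Text mode again: bytes are decoded back to a str.
    with open(path, encoding='utf-8') as fl:
        as_text = fl.read()

    # Binary mode ('rb'): the raw bytes, with no decoding.
    with open(path, 'rb') as fl:
        as_bytes = fl.read()

print(len(as_text), len(as_bytes))
```

The two lengths differ because the accented letter takes two bytes in UTF-8 but counts as a single character in a Python string.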

3.2.3 Errors and Exception Handling¶

Bugs & Debugging¶

Programming is a complex process. Since it is done by human beings, errors often occur. Programming errors are called bugs, and the process of tracking them down and correcting them is called debugging. There are three types of bugs:

  • Syntax error. A syntax error occurs when the code breaks the rules of the language (the structure of a program and the rules about that structure), so Python cannot run it at all.
  • Runtime error. Runtime errors are also called exceptions because they usually indicate that something exceptional (and bad) has happened while the program was running.
  • Semantic error (logic error). The computer will not generate any error messages. However, your program will not do the right thing.

Syntax error

In [82]:
print"Hello world"
  Input In [82]
    print"Hello world"
         ^
SyntaxError: invalid syntax

Runtime error

In [84]:
int("a")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [84], in <cell line: 1>()
----> 1 int("a")

ValueError: invalid literal for int() with base 10: 'a'

Semantic error (logic error)

In [85]:
a = lambda x : x + 8
In [86]:
a (5)
Out[86]:
13
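
A semantic error is easiest to recognise when the intended result is stated next to the code. The functions below are a made-up illustration, not part of the notebook: the first is meant to compute an average but is missing a pair of parentheses.

```python
def buggy_average(a, b):
    # Intended: (a + b) / 2. Division binds tighter than addition,
    # so this actually computes a + (b / 2).
    return a + b / 2

def average(a, b):
    return (a + b) / 2

print(buggy_average(2, 4))   # runs without any error message, but is wrong
print(average(2, 4))
```

Python raises nothing in either case; only a comparison with the expected value reveals the bug.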

How to Avoid Debugging

  • Understand the Problem
  • Start Small
  • Keep Improving It

What is an exception?¶

An exception is a signal that a condition has occurred that can’t be easily handled using the normal flow-of-control of a Python program. Exceptions are often defined as being “errors” but this is not always the case.

Raising and Catching Errors¶

The try/except control structure provides a way to process a run-time error and continue with program execution. With try/except, you tell the Python interpreter:

  • Try to execute a block of code, the “try” clause.

    • If the whole block of code executes without any run-time errors, just carry on with the rest of the program after the try/except statement.
  • If a run-time error does occur during execution of the block of code:

    • skip the rest of that block of code (but don’t exit the whole program)
    • execute a block of code in the “except” clause
    • then carry on with the rest of the program after the try/except statement

The syntax of the try/except statement:

try:
    <Some Code.... >
except <ErrorType>:
    <exception handler code block>
finally:
    <Some code .....(always executed)>
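
A runnable sketch of this structure follows. The function to_int and the log list are hypothetical helpers used only to record which clauses run; the else clause is an optional part of the statement that the template above does not show.

```python
def to_int(text, log):
    """Convert text to int; return None if that is not possible."""
    try:
        value = int(text)
    except ValueError:
        # Runs only if int() raised ValueError.
        log.append("except")
        value = None
    else:
        # Optional clause: runs only if the try block raised nothing.
        log.append("else")
    finally:
        # Runs in both cases.
        log.append("finally")
    return value

log = []
results = [to_int("12", log), to_int("a", log)]
print(results, log)
```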
In [87]:
items = ['a', 'b']
third = items[2]
print("hello world")
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Input In [87], in <cell line: 2>()
      1 items = ['a', 'b']
----> 2 third = items[2]
      3 print("hello world")

IndexError: list index out of range
In [88]:
items = ['a', 'b']
try:
    third = items[2]
except IndexError:
    print("I can't do it")
    
print("hello world")
I can't do it
hello world
In [89]:
for i in ["1",'2',"a"]:
    try:
        print(int(i))
    except ValueError:
        print("I can't do it")
1
2
I can't do it

The input() function allows user input.

input(prompt) -> return string
In [90]:
a = input("Enter your score: ")
Enter your score: 5
In [91]:
print(a)
5
In [ ]:
int(a)
In [92]:
from optional_1 import score
In [94]:
score()
Enter your score (it must be a number between 0 and 100): -10
Your score must be between 0 and 100
Enter your score (it must be a number between 0 and 100): 150
Your score must be between 0 and 100
Enter your score (it must be a number between 0 and 100): a
Your score must be a number!
Enter your score (it must be a number between 0 and 100): 95
You passed

Hint: use the following building blocks:

while, try/except, break, if, True, input
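
The validation step can be sketched on its own, without input(), so that it can be tested. This is not the code of optional_1.score, which is not shown in the notebook: check_score is a hypothetical helper, the messages mirror the session above, and the pass/fail threshold is left out because the source does not state it.

```python
def check_score(raw):
    """Return (is_valid, message) for one line of user input."""
    try:
        value = int(raw)
    except ValueError:
        return False, "Your score must be a number!"
    if not 0 <= value <= 100:
        return False, "Your score must be between 0 and 100"
    return True, "OK"

# Simulated inputs stand in for repeated input() calls.
for raw in ["-10", "150", "a", "95"]:
    is_valid, message = check_score(raw)
    print(raw, "->", message)
    if is_valid:
        break   # in an interactive loop: stop asking once the input is valid
```

In the interactive version, the for loop would become while True with an input() call at the top of the body.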

4. Homework assignment¶

4.1 Exercise 1:¶

You invest 100 million VND in Vietcombank shares at a price of 40,000 VND per share (1/1/2019), of which 30 million VND is borrowed. After 2 years (1/1/2021) the price of one VCB share is 80,000 VND, so the holding is worth 200 million VND.

Assume you sold all of the shares on 1/1/2021 with a transaction fee of 0.5%, and the interest rate on the loan used to buy the shares is 10% per year. How much money do you receive?

The loan interest rate stays the same over the 2 years the shares are held; the transaction fee is calculated on the total sale value.

4.2 Exercise 2:¶

stocks = ['VCB', 'HPG', 'FPT', 'TCB']

prices = [80, 40, 100, 50]

  • Assume the order of the numbers in prices matches the order of the corresponding stocks in stocks
  • Hint: use the index method

Question: Which stock has the highest price?

4.3 Exercise 3:¶

Present a case study that you are currently facing and that could be handled with Python.

4.4 Exercise 4:¶

Students receive a Pass/Fail grade. A score >= 65 (on a scale of 100) is a "Pass". Lower scores are considered a "Fail". In addition, a score > 95 is a "Top Score".

Write a function such that:

print(exam_grade(65)) # Should be Pass

print(exam_grade(55)) # Should be Fail

print(exam_grade(100)) # Should be Top Score

4.5 Exercise 5:¶

Write a function such that:

print(greeting()) # Should be welcome

print(greeting('Peter')) # Should be Welcome back Peter!

print(greeting('Any name')) # Should be "Hello there, " + name

4.6 Exercise 6:¶

Write a function which has input as an integer and return the sum from 1 to that integer. Example: my_sum_function_1(8) = 36

Write a function which has input as an integer and return the sum from 1 to that integer but only even numbers. Example: my_sum_function_2(10) = 30

4.7 Exercise 7:¶

Write a function to check whether an input number is a prime number or a composite number. If it is a composite number, print all of its divisors.

Example: check_prime_composite(10) -> 2,5 composite

Note: A prime number is a whole number greater than 1 whose only factors are 1 and itself.

4.8 Exercise 8:¶

Write a function to calculate the factorial of an integer.

Example: 5! = 5 * 4 * 3 * 2 * 1

4.9 Exercise 9:¶

Randomly generate 3 numbers a, b, c to form a quadratic equation. Write a function to solve this equation.

$$ a{x^2} + bx + c = 0 $$

Hint: Solution quadratic equation

  1. Check a: if a == 0 -> None
  2. Calculate delta: $$ \Delta = b^2 - 4ac $$
  3. Check if delta < 0 -> No solution

  4. Check if delta == 0 $$ x = \frac{-b}{2a} $$

  5. Check if delta > 0 -> Calculate x1, x2 $$ x_1 = \frac{-b - \sqrt{\Delta}}{2a} $$

$$ x_2 = \frac{-b + \sqrt{\Delta}}{2a} $$